212 research outputs found

    New results on a generalized coupon collector problem using Markov chains

    In this paper we study a generalized coupon collector problem, which consists of determining the distribution and the moments of the time needed to collect a given number of distinct coupons drawn from a set of coupons with an arbitrary probability distribution. We suppose that a special coupon, called the null coupon, can be drawn but never belongs to any collection. In this context, we obtain expressions for the distribution and the moments of this time. We also prove that the almost-uniform distribution, for which all the non-null coupons have the same drawing probability, is the distribution that minimizes the expected time to get a fixed subset of distinct coupons. This optimization result is extended to the complementary distribution of that time when the full collection is considered, thereby proving this well-known conjecture. Finally, we propose a new conjecture stating that the almost-uniform distribution should minimize the complementary distribution of the time needed to get any fixed number of distinct coupons. Comment: 14 pages
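
As a rough illustration of the full-collection case described above, the sketch below computes the expected time to collect every non-null coupon with the standard inclusion-exclusion identity for the maximum of geometric waiting times. It is not the Markov chain derivation of the paper; the function name `expected_full_collection_time` and the example probabilities are assumptions made for this example only.

```python
from itertools import combinations


def expected_full_collection_time(p):
    """Expected number of draws needed to see every non-null coupon.

    p[i] is the drawing probability of non-null coupon i; the remaining
    mass 1 - sum(p) is assumed to belong to the null coupon.
    Uses E[max] = sum over non-empty subsets S of (-1)^(|S|+1) / p(S),
    where p(S) is the total probability of the coupons in S.
    """
    n = len(p)
    total = 0.0
    for size in range(1, n + 1):
        sign = 1.0 if size % 2 == 1 else -1.0
        for subset in combinations(p, size):
            total += sign / sum(subset)
    return total


# Hypothetical example: null coupon of mass 0.5, two non-null coupons.
almost_uniform = expected_full_collection_time([0.25, 0.25])
skewed = expected_full_collection_time([0.1, 0.4])
```

With these illustrative values the almost-uniform case gives the smaller expected time, in line with the minimization result stated in the abstract. The enumeration is exponential in the number of coupons, so it is only meant for small examples.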

    Sketch *-metric: Comparing Data Streams via Sketching

    In this paper, we consider the problem of estimating the distance between any two large data streams under a small-space constraint. This problem is of utmost importance in data-intensive monitoring applications where input streams are generated rapidly. These streams need to be processed on the fly and accurately to quickly determine any deviation from nominal behavior. We present a new metric, the Sketch ⋆-metric, which makes it possible to define a distance between updatable summaries (or sketches) of large data streams. An important feature of the Sketch ⋆-metric is that, given a measure on the entire initial data streams, it preserves the axioms of that measure on the sketch (such as non-negativity, identity, symmetry, and the triangle inequality, but also specific properties of the f-divergence). Extensive experiments conducted on both synthetic traces and real data allow us to validate the robustness and accuracy of the Sketch ⋆-metric. 12 pages, two columns

    A Comparative Study of Rateless Codes for P2P Persistent Storage

    This paper evaluates the performance of two seminal rateless erasure codes, LT Codes and Online Codes. Their properties make them appropriate for coping with communication channels that have an unbounded loss rate. They are therefore very well suited to peer-to-peer systems. This evaluation targets two goals. First, it compares the performance of both codes in different adversarial environments and in different application contexts. Second, it helps in understanding how the parameters driving the behavior of the coding impact its complexity. To the best of our knowledge, this is the first comprehensive study that helps application designers set the optimal values of the coding parameters to best fit their P2P context

    A framework for proving the self-organization of dynamic systems

    This paper aims at providing a rigorous definition of self-organization, one of the most desired properties for dynamic systems (e.g., peer-to-peer systems, sensor networks, cooperative robotics, or ad-hoc networks). We characterize different classes of self-organization through liveness and safety properties that both capture information regarding the system entropy. We illustrate these classes through case studies. The first ones are two representative P2P overlays (CAN and Pastry) and the others are specific implementations of Ω (the leader oracle) and one-shot query abstractions for dynamic settings. Our study aims at understanding the limits and respective power of existing self-organized protocols and lays the basis for designing robust algorithms for dynamic systems

    Optimization results for a generalized coupon collector problem

    In this paper we study a generalized coupon collector problem, which consists of analyzing the time needed to collect a given number of distinct coupons drawn from a set of coupons with an arbitrary probability distribution. We suppose that a special coupon, called the null coupon, can be drawn but never belongs to any collection. In this context, we prove that the almost-uniform distribution, for which all the non-null coupons have the same drawing probability, is the distribution that stochastically minimizes the time needed to collect a fixed number of distinct coupons. Moreover, we show that in a given closed subset of probability distributions, the distribution with all its entries but one equal to the smallest possible value is the one that stochastically maximizes the time needed to collect a fixed number of distinct coupons. A computer science application shows the utility of these results. Comment: arXiv admin note: text overlap with arXiv:1402.524
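
To make the notion of stochastic minimization above concrete, the following sketch computes the complementary distribution P(T > t) of the time T needed to collect `c` distinct non-null coupons, by propagating probabilities over the subsets of coupons already collected. It is a brute-force illustration for small examples, not the proof technique of the paper; the function name `survival` and the probabilities used are assumptions for this example.

```python
def survival(p, c, t_max):
    """Return [P(T > 0), ..., P(T > t_max)] for the time T needed to
    collect c distinct non-null coupons.

    p[i] is the probability of non-null coupon i; 1 - sum(p) is assumed
    to be the null coupon mass. States are bitmasks of collected coupons;
    probability mass that reaches c distinct coupons is dropped.
    """
    n = len(p)
    p_null = 1.0 - sum(p)
    dist = {0: 1.0}  # nothing collected before the first draw
    out = []
    for _ in range(t_max + 1):
        out.append(sum(dist.values()))  # mass not yet absorbed
        nxt = {}
        for mask, prob in dist.items():
            # Null coupon or an already collected coupon: state unchanged.
            stay = p_null + sum(p[i] for i in range(n) if mask >> i & 1)
            nxt[mask] = nxt.get(mask, 0.0) + prob * stay
            for i in range(n):
                if not mask >> i & 1:
                    new = mask | (1 << i)
                    if bin(new).count("1") < c:  # still short of c coupons
                        nxt[new] = nxt.get(new, 0.0) + prob * p[i]
        dist = nxt
    return out


# Hypothetical comparison with a null coupon of mass 0.5 and c = 2.
uniform_tail = survival([0.25, 0.25], 2, 20)
skewed_tail = survival([0.1, 0.4], 2, 20)
```

For these illustrative inputs, every entry of `uniform_tail` is at most the matching entry of `skewed_tail`, which is what stochastic minimization by the almost-uniform distribution means.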

    SQUARE: Scalable Quorum-Based Atomic Memory with Local Reconfiguration

    Internet applications require more and more resources to satisfy unpredictable client needs. Specifically, such applications must ensure quality of service despite bursts of load. Distributed dynamic self-organized systems present an inherent adaptiveness that can cope with unpredictable bursts of load. Nevertheless, quality of service, and more particularly data consistency, remains hard to achieve in such systems, since participants (i.e., nodes) can crash, leave, and join the system at arbitrary times. Atomic consistency guarantees that any read operation returns the last written value of a data item, and it is preserved under data composition. To guarantee atomicity in the message-passing model, mutually intersecting sets of nodes (a.k.a. quorums) are used. The solution presented here, namely SQUARE, provides scalability, load balancing, fault tolerance, and self-adaptiveness, while ensuring atomic consistency. We specify our solution, prove it correct, and analyse it through simulations
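
The idea of mutually intersecting quorums mentioned in this abstract can be illustrated with a classic grid construction, in which a quorum is one full row plus one full column of a grid of nodes. This is a generic textbook-style sketch under that assumption and is not claimed to be the construction used by SQUARE; the function name `grid_quorums` and the grid dimensions are hypothetical.

```python
from itertools import product


def grid_quorums(rows, cols):
    """Build one quorum per grid cell: the cell's whole row plus its whole column.

    Nodes are identified by (row, column) pairs. Quorums (i, j) and (k, l)
    always share the node (i, l), so any two quorums intersect.
    """
    quorums = {}
    for i, j in product(range(rows), range(cols)):
        row = {(i, c) for c in range(cols)}
        col = {(r, j) for r in range(rows)}
        quorums[(i, j)] = row | col
    return quorums


# Hypothetical 3 x 4 grid of twelve nodes.
example_quorums = grid_quorums(3, 4)
```

Each quorum here contains `rows + cols - 1` nodes, so quorum size grows roughly with the square root of the number of nodes when the grid is close to square.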

    Uniform and Ergodic Sampling in Unstructured Peer-to-Peer Systems with Malicious Nodes

    ISBN: 978-3-642-17652-4. We consider the problem of uniform sampling in large-scale open systems. Uniform sampling is a fundamental scheme that guarantees that any individual in a population has the same probability of being selected as a sample. An important issue that seriously hampers the feasibility of uniform sampling in open and large-scale systems is the inevitable presence of malicious nodes. In this paper we show that restricting the number of requests that malicious nodes can issue and allowing for full knowledge of the composition of the system is a necessary and sufficient condition to guarantee uniform and ergodic sampling. In a nutshell, uniform and ergodic sampling guarantees that any node in the system is equally likely to appear as a sample at any non-malicious node in the system, and that infinitely often any node has a non-null probability of appearing as a sample at any honest node

    Analysis of a large number of Markov chains competing for transitions

    We consider the behavior of a stochastic system composed of several identically distributed, but non-independent, discrete-time absorbing Markov chains competing at each instant for a transition. The competition consists of determining at each instant, using a given probability distribution, the only Markov chain allowed to make a transition. We analyze the first time at which one of the Markov chains reaches its absorbing state. When the number of Markov chains goes to infinity, we analyze the asymptotic behavior of the system for an arbitrary probability mass function governing the competition. We give conditions for the existence of the asymptotic distribution, and we show how these results apply to cluster-based distributed systems when the competition between the Markov chains is handled using a geometric distribution
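
The competition mechanism described above can be sketched as a small simulation: at each instant one chain is selected according to a competition distribution, only that chain makes a transition, and the run stops the first time any chain is absorbed. This is a simplified reading of the abstract, not the analytical method of the paper; the function name `first_absorption_time`, the transition matrix, and the competition weights are assumptions for illustration.

```python
import random


def first_absorption_time(P, absorbing, q, rng, max_steps=10**6):
    """Simulate len(q) copies of the chain with transition matrix P.

    All chains start in state 0. At each instant, chain k is selected with
    probability proportional to q[k] and is the only one allowed to move.
    Returns the first instant at which some chain enters `absorbing`.
    """
    states = [0] * len(q)
    chains = range(len(q))
    targets = range(len(P))
    for t in range(1, max_steps + 1):
        k = rng.choices(chains, weights=q)[0]  # winner of the competition
        states[k] = rng.choices(targets, weights=P[states[k]])[0]
        if states[k] == absorbing:
            return t
    raise RuntimeError("no chain was absorbed within max_steps")


# Hypothetical three-state chain: state 2 is absorbing.
P_example = [
    [0.5, 0.5, 0.0],
    [0.0, 0.5, 0.5],
    [0.0, 0.0, 1.0],
]
# Geometric-like competition weights over four chains (illustrative values).
q_example = [0.5 ** (k + 1) for k in range(4)]
sample_time = first_absorption_time(P_example, 2, q_example, random.Random(0))
```

Repeating the call with different seeds gives an empirical picture of the first absorption time for a finite number of chains, which is the quantity whose asymptotic behavior the abstract refers to.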

    On the Power of the Adversary to Solve the Node Sampling Problem

    We study the problem of achieving uniform and fresh peer sampling in large-scale dynamic systems under adversarial behaviors. Briefly, uniform and fresh peer sampling guarantees that any node in the system is equally likely to appear as a sample at any non-malicious node in the system, and that infinitely often any node has a non-null probability of appearing as a sample at honest nodes. This sample is built locally out of a stream of node identifiers received at each node. An important issue that seriously hampers the feasibility of node sampling in open and large-scale systems is the unavoidable presence of malicious nodes. The main objective of malicious nodes is to continuously and heavily bias the input data stream out of which samples are obtained, to prevent (honest) nodes from being selected as samples. First, we demonstrate that restricting the number of requests that malicious nodes can issue and providing full knowledge of the composition of the system is a necessary and sufficient condition to guarantee uniform and fresh sampling. We also define and study two types of adversary models: an omniscient adversary that has the capacity to eavesdrop on all the messages exchanged within the system, and a blind adversary that can only observe messages sent or received by the nodes it controls. The former model allows us to derive lower bounds on the impact that the adversary has on the sampling functionality, while the latter corresponds to a more realistic setting. Given any sampling strategy, we quantify the minimum effort exerted by both types of adversary on any input stream to prevent this sampling strategy from outputting a uniform and fresh sample